
Conversation

@Jan-Kazlouski-elastic
Contributor

This PR updates the specification to reflect the changes made in elastic/elasticsearch#132388.

Additional actions

  • Signed the CLA
  • Ran `make contrib`

@github-actions
Contributor

github-actions bot commented Dec 5, 2025

Below you can find the validation results for this API against the target branch.

| API | Status | Request | Response |
| --- | --- | --- | --- |
| inference.put_nvidia | ➕ | ⚪ Missing test | ⚪ Missing test |

You can validate this API yourself by using the `make validate` target.

# Conflicts:
#	output/openapi/elasticsearch-openapi.json
#	output/openapi/elasticsearch-serverless-openapi.json
#	output/schema/schema.json
},
"dependencies": {
-    "@redocly/cli": "^1.34.5"
+    "@redocly/cli": "^1.34.6"
Contributor


I don't think this should be getting changed here.

Comment on lines +23 to +26
"rerank",
"text_embedding",
"completion",
"chat_completion"
Contributor


Nitpick, but could these be in alphabetical order?
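
If the suggestion is applied, the quoted fragment would read (a sketch only; same values, alphabetized):

```json
"chat_completion",
"completion",
"rerank",
"text_embedding"
```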

*/
model_id: string
/**
* For a `text_embedding` task, the maximum number of tokens per input before chunking occurs.
Contributor


This should be "For a `text_embedding` task, the maximum number of tokens per input. Inputs exceeding this value are truncated prior to sending to the Nvidia API."

This is wrong almost everywhere in the docs; there's an issue describing some of the problems with `max_input_tokens`.
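
Applied to the spec, the suggested wording might look like this (a sketch; the `max_input_tokens` field name and optional `integer` type are assumptions, not taken from the diff above):

```ts
/**
 * For a `text_embedding` task, the maximum number of tokens per input.
 * Inputs exceeding this value are truncated prior to sending to the Nvidia API.
 */
max_input_tokens?: integer
```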

Comment on lines +1849 to +1852
text_embedding,
completion,
chat_completion,
rerank
Contributor


For consistency, could these be in alphabetical order?

*/
input_type?: NvidiaInputType
/**
* For a `text_embedding` task, the method to handle inputs longer than the maximum token length.
Contributor


To help differentiate this from `max_input_tokens`, it might be better to word it like "the method used by the Nvidia model to handle inputs longer than..."
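
As a sketch of that suggested wording (the `truncate` field name and `NvidiaTruncateType` type are hypothetical, following the naming pattern of `NvidiaInputType` in the snippet above):

```ts
/**
 * For a `text_embedding` task, the method used by the Nvidia model to handle
 * inputs longer than the maximum token length.
 */
truncate?: NvidiaTruncateType
```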

Comment on lines +1818 to +1820
/**
* The URL of the Nvidia model endpoint.
*/
Contributor


Would it be helpful to include the default URLs for each task type if `url` isn't specified?

Comment on lines +144 to +147
text_embedding,
chat_completion,
completion,
rerank
Contributor


For consistency, could these be in alphabetical order?


Labels

ml · skip-backport (This pull request should not be backported) · specification · Team:ML



3 participants